Abstract

This paper describes the analysis of a selected testbed of Semantic Web ontologies by means of a SPARQL query, which determines those ontologies that can be related to the description logic $\mathcal{DL}\langle\forall^{\pi}_{0}\rangle$, introduced in [?] and studied in [9]. We will see that a reasonable number of them are expressible within this computationally efficient language. We expect that, in the long term, a temporalization of description logics, and consequently of OWL(2), can open new perspectives for the inclusion in this language of a greater number of ontologies of the testbed and, hopefully, of the "real world".


1 INTRODUCTION

In recent years, the Semantic Web has steadily expanded its area of influence. As an innovative instrument for retrieving information that is not explicitly stored, and a way of organizing concepts and relations by their meaning, it opens up many horizons for knowledge representation. However, only a handful of experts and researchers know that beneath a surprisingly vast range of new features lies a mathematical and logical structure, within which they struggle day by day to balance expressiveness and efficiency. The Description Logics formalisms, which we will see in Section 1.1, are the formal bases for the so-called Web 3.0, which should allow one to automatically infer (and retrieve) new information by reasoning on "semanticized" knowledge repositories. Some examples of families of logics are described in the following, together with a short coverage of topics such as RDF graphs and OWL.

Section 2 introduces the description logic $\mathcal{DL}\langle\forall^{\pi}_{0}\rangle$, showing its notable characteristics of expressive power and polynomial complexity. In the same section, a more detailed description of the analysis, which may be seen as a conceptual experiment, follows. Our purpose is to show that a good number of real-world ontologies may be related to this family of logics. The results are promising enough to spur us to stay on this path and to complement it with studies on the temporalization of Description Logics and, consequently, of the Semantic Web (briefly touched upon in this report), which we will approach in the near future.

The SPARQL query that was specifically created to assess the membership of some real-world ontologies in the aforementioned logic is reported in Appendix A.
1.1 Description Logics

Description Logics (DLs) are a family of formalisms for knowledge representation, built on logic-based semantics. They are founded on some fundamental ideas:

• basic syntactic "blocks" are atomic concepts (1-ary predicates), atomic roles (2-ary predicates) and individuals (constants);

• the expressiveness of a certain language depends on the use of a set of chosen constructors, that give birth to 
complex concepts and roles starting from existing ones; 

• by means of classification of concepts, a subsumption hierarchy is established, which specifies what concept 
includes or is included by another; 
• implied knowledge is automatically obtained through a logic procedure, called reasoning, essentially based on the application of inference rules to subsumption relationships between concepts and to instance relationships between individuals and concepts.

On the assumption that a Knowledge Representation System (KR-System) is to give an answer to a user query, reasoning algorithms for DLs should be regarded as decision procedures which return a positive or negative verdict. This raises the decision problem for such languages. Furthermore, having an answer does not always mean getting it in a reasonable amount of time, and that compels us to consider also the complexity of the algorithms at stake. Decidability and complexity directly depend on the expressive power of the description logic we use: whereas very expressive DLs tend to have computationally hard (or even undecidable) inference problems, the DLs that are more efficient in reasoning turn out not to be expressive enough to represent all concepts and relations in the domain of interest. Ongoing research in the field of DLs aims precisely at balancing expressiveness against efficiency, without giving up the semantic precision that makes these logics applicable to real-world situations.

A Description Logics Knowledge Base (DLKB) is made of two components: the TBox (Terminological Box) and the ABox (Assertional Box). The former contains the vocabulary, i.e., the definitions of atomic and non-atomic concepts and roles, called axioms, whereas the latter describes individuals in terms of this vocabulary, i.e., it includes the declarations, called assertions, of their instances. By means of the TBox we can name complex descriptions of concepts and roles. The language for such naming is what distinguishes one DL from another and is based on a model-theoretic semantics. Thus, statements in the TBox and ABox can be regarded as first-order logic formulas. Reasoning procedures for the terminological part are used to verify the satisfiability of a description (i.e., its non-contradictoriness), or whether it is subsumed by another (i.e., whether the latter is more general than the former), while those for the assertional part establish whether a set of assertions is consistent (i.e., it has a model) or whether it entails that a certain individual is an instance of a given concept. Satisfiability tests for descriptions and consistency tests for sets of assertions allow one to establish whether the knowledge base is meaningful or not, whereas subsumption tests allow one to maintain a hierarchy of concepts ab universali; finally, instance tests give one the ability to query the system about single individuals.

More formally, a generic DL axiom is a formula of one of the following types: 

• $C \equiv D$ (equivalence between concepts)

• $C \sqsubseteq D$ (subsumption between concepts)

• $R \equiv S$ (equivalence between roles)

• $R \sqsubseteq S$ (subsumption between roles), 1

where the symbols $C$, $D$ are names or expressions of complex concepts, which are formed by 1-ary or 2-ary operations on/between atomic (indicated by $A$ in the following) or complex concepts, while $R$, $S$ are names or expressions of complex roles, which are formed by 1-ary or 2-ary operations on/between atomic (indicated by $P$ in the following) or complex roles.

A generic DL assertion is a formula of one of the following types: 

• C (a) (concept assertion) 

• R(a,b) (role assertion), 

where the symbols a, b are names of individuals, for which the concept C or the role R holds. 2 

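From a semantic point of view, an interpretation $\mathcal{I}$ is a pair $(\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$, where the non-empty set $\Delta^{\mathcal{I}}$ represents the domain of the interpretation and the interpretation function $\cdot^{\mathcal{I}}$ associates a set $A^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}$ to every atomic concept $A$, a relation $P^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$ to every atomic role $P$, and an element $a^{\mathcal{I}} \in \Delta^{\mathcal{I}}$ to every individual $a$. We write: 3

• $\mathcal{I} \models (C \equiv D)$ iff 4 $C^{\mathcal{I}} = D^{\mathcal{I}}$

• $\mathcal{I} \models (C \sqsubseteq D)$ iff $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$

• $\mathcal{I} \models (R \equiv S)$ iff $R^{\mathcal{I}} = S^{\mathcal{I}}$

• $\mathcal{I} \models (R \sqsubseteq S)$ iff $R^{\mathcal{I}} \subseteq S^{\mathcal{I}}$

• $\mathcal{I} \models C(a)$ iff $a^{\mathcal{I}} \in C^{\mathcal{I}}$

• $\mathcal{I} \models R(a,b)$ iff $(a^{\mathcal{I}}, b^{\mathcal{I}}) \in R^{\mathcal{I}}$

1 The equivalence and subsumption between roles may also be indicated by the symbols $=$ and $\subseteq$, respectively.
2 A large number of constructs is listed in the table at the end of this section.

3 In the following, $\mathcal{I} \models \varphi$ means "$\mathcal{I}$ satisfies $\varphi$", where $\varphi$ can be an axiom or an assertion. A syntactic/semantic reference for the main axioms, assertions and property declarations is listed in the table at the end of this section.
4 "Iff" is short for "if and only if".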

Finally, we say that $\mathcal{I}$ is a model for a DLKB $\mathcal{K}$ (and write $\mathcal{I} \models \mathcal{K}$) if $\mathcal{I}$ satisfies all the axioms and assertions of $\mathcal{K}$. The latter is said to be consistent if there exists at least one interpretation satisfying it, and the search for such an interpretation (consistency problem) is the key to reasoning.
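The satisfaction conditions above can be made concrete on a small finite interpretation. The following is only a minimal sketch: the domain, the concept and role names, and the tuple encoding of statements are all hypothetical and chosen for illustration, and only atomic concepts and roles are handled.

```python
# Hypothetical finite interpretation; every name below is illustrative only.
DOMAIN = {"anna", "bob", "rex"}
CONCEPTS = {"Person": {"anna", "bob"}, "Owner": {"anna"}, "Dog": {"rex"}}
ROLES = {"owns": {("anna", "rex")}}
INDIVIDUALS = {"a": "anna", "r": "rex"}  # individual name -> domain element


def satisfies(statement):
    """Check one axiom or assertion over atomic concepts and roles."""
    kind, *args = statement
    if kind == "equiv":               # C ≡ D  iff  C^I = D^I
        c, d = args
        return CONCEPTS[c] == CONCEPTS[d]
    if kind == "sub":                 # C ⊑ D  iff  C^I ⊆ D^I
        c, d = args
        return CONCEPTS[c] <= CONCEPTS[d]
    if kind == "concept_assertion":   # C(a)  iff  a^I ∈ C^I
        c, a = args
        return INDIVIDUALS[a] in CONCEPTS[c]
    if kind == "role_assertion":      # R(a,b)  iff  (a^I, b^I) ∈ R^I
        r, a, b = args
        return (INDIVIDUALS[a], INDIVIDUALS[b]) in ROLES[r]
    raise ValueError(f"unknown statement kind: {kind}")


def is_model(kb):
    """The interpretation is a model of kb if it satisfies every statement."""
    return all(satisfies(s) for s in kb)


KB = [
    ("sub", "Owner", "Person"),
    ("concept_assertion", "Dog", "r"),
    ("role_assertion", "owns", "a", "r"),
]
```

Checking a given interpretation, as done here, is much easier than the consistency problem itself, which asks whether any such interpretation exists.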

1.2 Families of logics 

As already observed, distinct DLs are characterized by the constructs allowed to form complex concepts and 
roles starting from atomic ones. The names are usually specified by a series of alphabet letters and symbols. In the 
following, we will not concentrate on formal semantics, but, for the sake of clarity, we will only hint at the meaning 
of some constructs. 

By way of an example, we briefly describe the $\mathcal{ALC}$ logic (Attributive Language with Complements). Its syntax obeys the following rules:

$C, D \to A \mid \top \mid \bot \mid \neg C \mid C \sqcap D \mid C \sqcup D \mid \forall R.C \mid \exists R.C$,

where $\top$ denotes the concept enclosing any other one (top concept), $\bot$ denotes the concept enclosed in any other one (bottom concept), $\neg$ negates a concept, $\sqcup$ represents the union of concepts (notice the analogy with the corresponding set operators) and $\sqcap$ represents the intersection of concepts. The last two constructs are called, respectively, universal and existential restriction, and are pivotal in the research connected to this report. The concepts that can be built in $\mathcal{ALC}$ are called $\mathcal{ALC}$-concepts. The axioms and assertions that may be expressed in the $\mathcal{ALC}$ logic are

$C \equiv D$, $C \sqsubseteq D$, $C(a)$,

which indicate, respectively, equivalence and subsumption (also called GCI, General Concept Inclusion) between two concepts, and the membership of an individual in a concept. However, reasoning in $\mathcal{ALC}$ has a PSPACE-complete computational complexity.
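To illustrate how the constructs of this grammar are interpreted, the sketch below computes the extension of a concept over a finite interpretation. The nested-tuple encoding of concepts (for instance `("some", "owns", "Dog")` for an existential restriction) and all names in the example are assumptions made for illustration, not notation taken from this report.

```python
def extension(concept, domain, concepts, roles):
    """Return the set of domain elements that belong to the given concept."""
    if concept == "TOP":
        return set(domain)
    if concept == "BOTTOM":
        return set()
    if isinstance(concept, str):          # atomic concept A
        return set(concepts[concept])

    def ext(c):
        return extension(c, domain, concepts, roles)

    op = concept[0]
    if op == "not":
        return set(domain) - ext(concept[1])
    if op == "and":
        return ext(concept[1]) & ext(concept[2])
    if op == "or":
        return ext(concept[1]) | ext(concept[2])
    if op in ("all", "some"):
        role, filler = roles[concept[1]], ext(concept[2])

        def successors(a):
            return {b for (x, b) in role if x == a}

        if op == "all":   # every R-successor lies in the filler (vacuously true if none)
            return {a for a in domain if successors(a) <= filler}
        return {a for a in domain if successors(a) & filler}  # some R-successor in filler
    raise ValueError(f"unknown constructor: {op}")


# Hypothetical interpretation used only for this example.
domain = {"anna", "bob", "rex"}
concepts = {"Person": {"anna", "bob"}, "Dog": {"rex"}}
roles = {"owns": {("anna", "rex")}}
```

Note that an element with no successors for a role satisfies every universal restriction on that role, which is why the universal restriction in this example holds for the whole domain.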

In addition to those seen above, the most common constructs for concepts which can be formed in DLs are $\le 1R$ (functional restriction, denoted $\mathcal{F}$, that is equivalent to $\exists R.\top$), $\le nR$ and $\ge nR$ (numerical restrictions, $\mathcal{N}$, which enclose the functional one), $\le nR.C$ and $\ge nR.C$ (qualified restrictions, $\mathcal{Q}$, that include the numerical ones), $\{a\}$ and $\{a_1, \ldots, a_n\}$ (nominals, $\mathcal{O}$), and $\exists R.\mathit{Self}$ (self-concept). Constructs for roles are often denoted by an operator symbol inside brackets after the name of the logic. They are $R^-$ (inverse role, $\mathcal{I}$), $R \sqcup S$, $R \sqcap S$ and $\neg R$ (role union, intersection and complement, respectively), $R \circ S$ (role composition, also used in chains), $R^+$ and $R^*$ (transitive and reflexive-transitive closure of roles), $id(C)$ (concept identity), and $R_{C|}$, $R_{|D}$ and $R_{C|D}$ (role restrictions). Sometimes, the symbols $U$ (or $\nabla$, universal role, defining the role which encloses any other one) and $N$ (or $\triangle$, empty role, defining the role enclosed in any other one) are used. Other kinds of axioms which can be introduced are $R \equiv S$ (equivalence between roles), $R \sqsubseteq S$ (hierarchy between roles, $\mathcal{H}$, denoted also by $R \subseteq S$), and the related assertion $R(a, b)$ (role instance). The reflexive, irreflexive, symmetric, antisymmetric, transitive, intransitive, disjunctive (two roles having no pair of elements in common), functional and inverse functional properties for roles are denoted by symbols that often differ in the literature, but are never ambiguous, e.g., respectively, $Refl(R)$, $Irrefl(R)$, $Sym(R)$, $Asym(R)$, $Tr(R)$, $Intr(R)$, $Disj(R, S)$, $Fn(R)$, $InvFn(R)$. When the transitive property is allowed, we use the symbol $\mathcal{S}$, which corresponds to $\mathcal{ALC}(+)$. Sometimes, even small differences among logics (e.g., a limitation on the use of atomic rather than complex roles on the right or left part of an assertion) can make a big difference in expressiveness and complexity. Thus, it is not always easy to denote DLs concisely, and they consequently constitute an ever-open research field.
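The naming convention just described can be sketched as a simple lookup from letters to features. The table below only covers the letters mentioned in this section, the feature descriptions are paraphrases, and the function name is hypothetical; real DL names may also carry brackets and superscripts that this sketch does not parse.

```python
# Letters introduced in this section, mapped to a paraphrase of their meaning.
FEATURES = {
    "S": "ALC with transitive roles",
    "H": "role hierarchy",
    "R": "complex role inclusions",
    "O": "nominals",
    "I": "inverse roles",
    "F": "functional restrictions",
    "N": "numerical restrictions",
    "Q": "qualified restrictions",
}


def decode_dl_name(name):
    """Split a plain DL name such as 'SHIF' or 'ALCQ' into its features."""
    features = []
    if name.startswith("ALC"):
        features.append("ALC base constructs")
        name = name[3:]
    for letter in name:
        features.append(FEATURES[letter])  # raises KeyError on unknown letters
    return features
```

For example, `decode_dl_name("SROIQ")` lists transitive roles, complex role inclusions, nominals, inverse roles and qualified restrictions, in that order.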

One or more of the constructs, axioms and assertions described above are present in the families of logics that are the basis of the languages used in ontologies. Among these, an important example is the logic $\mathcal{SROIQ}^{(\mathcal{D})}$ (in the literature usually denoted by $\mathcal{SROIQ}(\mathcal{D})$), at the basis of the OWL 2 DL profile, which will be discussed in the following (the letter in parentheses indicates the use of concrete domains, which will not be discussed here). Its syntax is easily inferred from the symbols, while its complexity is N2ExpTime-hard. 5



5 By the symbol $\mathcal{R}$, we mean that a DL allows complex inclusions of the kind $R \circ S \sqsubseteq R$ and $R \circ S \sqsubseteq S$.
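| Construct | Syntax | Semantics |
|---|---|---|
| top concept | $\top$ | $\top^{\mathcal{I}} = \Delta^{\mathcal{I}}$ |
| bottom concept | $\bot$ | $\bot^{\mathcal{I}} = \emptyset$ |
| concept negation | $\neg C$ | $(\neg C)^{\mathcal{I}} = \Delta^{\mathcal{I}} \setminus C^{\mathcal{I}}$ |
| concept intersection | $C \sqcap D$ | $(C \sqcap D)^{\mathcal{I}} = C^{\mathcal{I}} \cap D^{\mathcal{I}}$ |
| concept union | $C \sqcup D$ | $(C \sqcup D)^{\mathcal{I}} = C^{\mathcal{I}} \cup D^{\mathcal{I}}$ |
| universal restriction | $\forall R.C$ | $(\forall R.C)^{\mathcal{I}} = \{a \in \Delta^{\mathcal{I}} \mid (\forall (a,b) \in R^{\mathcal{I}})(b \in C^{\mathcal{I}})\}$ |
| existential restriction | $\exists R.C$ | $(\exists R.C)^{\mathcal{I}} = \{a \in \Delta^{\mathcal{I}} \mid \exists (a,b) \in R^{\mathcal{I}} \wedge b \in C^{\mathcal{I}}\}$ |
| numerical restriction | $\le nR$ | $(\le nR)^{\mathcal{I}} = \{a \in \Delta^{\mathcal{I}} \mid \#\{b \in \Delta^{\mathcal{I}} \mid (a,b) \in R^{\mathcal{I}}\} \le n\}$ |
| | $\ge nR$ | $(\ge nR)^{\mathcal{I}} = \{a \in \Delta^{\mathcal{I}} \mid \#\{b \in \Delta^{\mathcal{I}} \mid (a,b) \in R^{\mathcal{I}}\} \ge n\}$ |
| qualified restriction | $\le nR.C$ | $(\le nR.C)^{\mathcal{I}} = \{a \in \Delta^{\mathcal{I}} \mid \#\{b \in C^{\mathcal{I}} \mid (a,b) \in R^{\mathcal{I}}\} \le n\}$ |
| | $\ge nR.C$ | $(\ge nR.C)^{\mathcal{I}} = \{a \in \Delta^{\mathcal{I}} \mid \#\{b \in C^{\mathcal{I}} \mid (a,b) \in R^{\mathcal{I}}\} \ge n\}$ |
| nominals | $\{a\}$ | $\{a\}^{\mathcal{I}} = \{a^{\mathcal{I}}\}$ |
| | $\{a_1, \ldots, a_n\}$ | $\{a_1, \ldots, a_n\}^{\mathcal{I}} = \{a_1^{\mathcal{I}}, \ldots, a_n^{\mathcal{I}}\}$ |
| self concept | $\exists R.\mathit{Self}$ | $(\exists R.\mathit{Self})^{\mathcal{I}} = \{a \in \Delta^{\mathcal{I}} \mid (a,a) \in R^{\mathcal{I}}\}$ |
| universal role | $U$, $\nabla$ | $U^{\mathcal{I}} = \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$ |
| empty role | $N$, $\triangle$ | $N^{\mathcal{I}} = \emptyset \times \emptyset$ |
| role inverse | $R^-$ | $(R^-)^{\mathcal{I}} = \{(a,b) \mid (b,a) \in R^{\mathcal{I}}\}$ |
| role negation | $\neg R$ | $(\neg R)^{\mathcal{I}} = (\Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}) \setminus R^{\mathcal{I}}$ |
| role intersection | $R \sqcap S$ | $(R \sqcap S)^{\mathcal{I}} = R^{\mathcal{I}} \cap S^{\mathcal{I}}$ |
| role union | $R \sqcup S$ | $(R \sqcup S)^{\mathcal{I}} = R^{\mathcal{I}} \cup S^{\mathcal{I}}$ |
| chaining | $R \circ S$ | $(R \circ S)^{\mathcal{I}} = R^{\mathcal{I}} \circ S^{\mathcal{I}}$ |
| concept identity | $id(C)$ | $(id(C))^{\mathcal{I}} = \{(a,a) \mid a \in C^{\mathcal{I}}\}$ |
| transitive closure | $R^+$ | $(R^+)^{\mathcal{I}} = \bigcup_{n \ge 1} (R^{\mathcal{I}})^n$ |
| refl-trans closure | $R^*$ | $(R^*)^{\mathcal{I}} = (R^+)^{\mathcal{I}} \cup (id(\top))^{\mathcal{I}}$ |
| role restriction | $R_{C\mid}$ | $(R_{C\mid})^{\mathcal{I}} = \{(a,b) \in R^{\mathcal{I}} \mid a \in C^{\mathcal{I}}\}$ |
| | $R_{\mid D}$ | $(R_{\mid D})^{\mathcal{I}} = \{(a,b) \in R^{\mathcal{I}} \mid b \in D^{\mathcal{I}}\}$ |
| | $R_{C\mid D}$ | $(R_{C\mid D})^{\mathcal{I}} = \{(a,b) \in R^{\mathcal{I}} \mid a \in C^{\mathcal{I}} \wedge b \in D^{\mathcal{I}}\}$ |

Table 1.1. Main constructs for concepts and roles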



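| Syntax | Semantics |
|---|---|
| $C \equiv D$ | $C^{\mathcal{I}} = D^{\mathcal{I}}$ |
| $C \sqsubseteq D$ | $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$ |
| $C(a)$ | $a^{\mathcal{I}} \in C^{\mathcal{I}}$ |
| $\neg C(a)$ | $a^{\mathcal{I}} \notin C^{\mathcal{I}}$ |
| $R \equiv S$ | $R^{\mathcal{I}} = S^{\mathcal{I}}$ |
| $R \sqsubseteq S$ | $R^{\mathcal{I}} \subseteq S^{\mathcal{I}}$ |
| $R(a,b)$ | $(a^{\mathcal{I}}, b^{\mathcal{I}}) \in R^{\mathcal{I}}$ |
| $\neg R(a,b)$ | $(a^{\mathcal{I}}, b^{\mathcal{I}}) \notin R^{\mathcal{I}}$ |
| $R \sqcap S \equiv N$, $Disj(R, S)$ | $R^{\mathcal{I}} \cap S^{\mathcal{I}} = \emptyset$ |
| $Refl(R)$ | $(\forall a \in \Delta^{\mathcal{I}})((a,a) \in R^{\mathcal{I}})$ |
| $\neg Refl(R)$ | $(\forall a \in \Delta^{\mathcal{I}})((a,a) \notin R^{\mathcal{I}})$ |
| $Sym(R)$ | $(\forall (a,b) \in R^{\mathcal{I}})((b,a) \in R^{\mathcal{I}})$ |
| $\neg Sym(R)$ | $(\forall (a,b) \in R^{\mathcal{I}})((b,a) \notin R^{\mathcal{I}})$ |
| $Trans(R)$ | $(\forall (a,b) \in R^{\mathcal{I}})((b,c) \in R^{\mathcal{I}} \to (a,c) \in R^{\mathcal{I}})$ |
| $\neg Trans(R)$ | $(\forall (a,b) \in R^{\mathcal{I}})((b,c) \in R^{\mathcal{I}} \to (a,c) \notin R^{\mathcal{I}})$ |

Table 1.2. Main types of axioms and assertions on concepts and roles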



1.3 Semantic Web 

The so-called Web 2.0 describes the current model of information search. The difference from Web 1.0 lies in the change in the origin of this information (from below, i.e., from users, rather than from above, i.e., from webmasters). Yet, a paradigm change in its fundamental structure has not occurred, contrary to what the creator of the Web, Tim Berners-Lee, wished. Indeed, contents are still exclusively made up of pages connected by links, whose nontrivial words, together with metadata, are indexed by search engines, by means of which a correspondence between those words and the ones entered by users can be found. In this way, notwithstanding the high efficiency and precision of the algorithms involved, such engines constitute a "stupid" example of information retrieval, where the matching between terms, even if advanced, weighs more than their meaning.



6 The use of Trans (R), Refl (R), and so on, looks like adding a new type of axiom. Actually, they must be regarded as abbreviations of more complex expressions involving equivalence, subsumption and various constructs which will not be treated here.

The change brought by the so-called Semantic Web (SW or Web 3.0) addresses the main issue that in the Web, as we know it, most of the contents are structured in order to be read by humans, rather than investigated by programs in an automatic way. A computer can jump from page to page by following links, but it does not make any assumption on the semantics of their contents. The SW is an extension of the "previous" Web that allows one to assign a meaning to information and provides machines with the ability to process and "understand" what in the past they could only display. For all this to work, computers should be able to access well-structured information schemas and apply inference rules to permit automatic reasoning, in order to delocalize knowledge representation and spread it over the Net. The languages to which these rules apply should be expressive enough to make the Web capable of "reasoning" in a versatile and widespread way. That is made possible by tools such as the eXtensible Markup Language (XML), the Resource Description Framework (RDF) and the ontologies described by the Web Ontology Language (OWL).

1.4 RDF graphs 

RDF is a model for the representation of information on the Web. The basic idea is that every resource (concept, class, property, object, value, etc.) can be univocally described in terms of simple or identified-by-value properties. This allows an RDF model to be schematized as a directed graph, whose nodes represent primitive objects and whose edges denote properties. The limited vocabulary of RDF is extended by RDFS (RDF Schema), which allows one to define classes and properties in a more powerful way. For example, it is possible to define a property as a relation by indicating its domain and/or range, or to state which class is a subclass of another.

From a data structure point of view, an RDF graph may be seen as a set of triples of type (subject predicate object). For each triple, subject and object are two nodes connected by an edge that represents a predicate; thus, the subject of a triple may be the object of another and vice versa. Each of these nodes may be identified by a URI/IRI, and so it can represent a resource, or it may be simply a "connector" (in this case, it is called a blank node, bnode, or anonymous node). If an object is not a subject of another triple, it may also be a literal, i.e., a datum representing a value in a certain domain.
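As a rough illustration of this data structure, the sketch below stores a tiny graph as a set of triples and classifies each term as an IRI, a blank node, or a literal. The `http://example.org/` namespace, the resource names and the string conventions used to tell the three kinds of term apart are all assumptions made for this example.

```python
from collections import namedtuple

Triple = namedtuple("Triple", "subject predicate object")

EX = "http://example.org/"  # hypothetical namespace, for illustration only

graph = {
    Triple(EX + "anna", EX + "owns", EX + "rex"),
    Triple(EX + "rex", EX + "name", '"Rex"'),          # literal object
    Triple(EX + "anna", EX + "address", "_:b1"),       # blank node as object
    Triple("_:b1", EX + "city", '"Catania"'),          # same blank node as subject
}


def kind(term):
    """Classify a term using the simple string conventions of this sketch."""
    if term.startswith("_:"):
        return "blank"
    if term.startswith('"'):
        return "literal"
    return "iri"


def outgoing(graph, node):
    """Return the triples whose subject is the given node."""
    return {t for t in graph if t.subject == node}
```

Here the blank node `_:b1` is the object of one triple and the subject of another, while literals appear only in object position.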

On the representation side, W3C suggests five types of serializations for an RDF graph, the most used of which 
are RDF/XML (XML language is used) and Turtle (descriptions through lists of triples). 7 XML is used because of its 
precise representation of data. Thus, they can be shared, by means of RDF, among various Web applications while 
preserving their original meaning. 

1.5 Ontologies 

OWL ontologies extend RDFS vocabulary further. 8 If RDFS guarantees generality and precision in constructing knowledge representation, with OWL we reach such an expressiveness that we can "argue" about described information and "extract" more, through reasoning. It provides powerful tools to define classes (that represent concepts), properties (relations among classes) and individuals (their instances), and gives the chance to combine them in logic constraints by means of which necessity and/or sufficiency may be expressed.

According to a W3C recommendation, OWL is made up of three sublanguages with increasing expressiveness: OWL Lite, OWL DL and OWL Full. The first two correspond almost exactly to two specific DLs (resp. $\mathcal{SHIF}^{(\mathcal{D})}$ and $\mathcal{SHOIN}^{(\mathcal{D})}$). 9 The last one uses the same set of constructs as OWL DL, but allows them to be unconstrained, according to RDF style. Due to the lack of restrictions on the transitive property and to the possibility of handling concepts as individuals (metamodeling), OWL Full turns out to be undecidable; thus, given the correspondence between the other two sublanguages and the aforesaid DLs, OWL DL is the most expressive decidable sublanguage of OWL. 10

OWL 2 extends OWL by enhancing its features and expressiveness. Apart from a limited number of cases, OWL 2 is perfectly "backward compatible" with the old OWL (which we call OWL 1 to differentiate it from OWL 2), i.e., all the ontologies of the latter keep the same semantics they had before, even if "fed" to a reasoner for OWL 2. Concisely, OWL 2 simplifies the writing of the most common statements, permits metamodeling, adds some new constructs which increase its expressiveness, and extends support for data types.



7 The serialization of an object is the process which allows one to represent it in an accessible way, in our case a text file.

8 Computer science ambitiously draws on the philosophical term ontology to show that the OWL language can describe the world starting from its ultimate constituents and from the relations among them.

9 OWL-Lite and OWL-DL provide the possibility of defining annotations, which the corresponding DLs do not. In any case, annotations affect neither reasoning, nor complexity, nor decidability. For the sake of precision, OWL-Lite is considered a "syntactic subset" of OWL-DL.

10 Obviously, nothing excludes the existence of another decidable and more expressive, but not yet studied, sublanguage.

As for sublanguages, in OWL 2 we can deal with OWL 2 Full and OWL 2 DL. As mentioned before, the latter corresponds to the $\mathcal{SROIQ}^{(\mathcal{D})}$ logic. Though the former may be syntactically seen as the union of the latter with RDFS, it is semantically compatible with OWL 2 DL (insofar as its semantics allows one to draw all the inferences that can be drawn by using the semantics of OWL 2 DL). Similarly to the analogous sublanguage OWL 1 Full, OWL 2 Full finds application in the modelling of concepts where reasoning is not required. A new feature of OWL 2 is the use of profiles (i.e., fragments of the language) OWL 2 EL, OWL 2 QL and OWL 2 RL: 11 the first is useful for applications involving ontologies that contain a great number of classes and/or properties; the second aims at those applications with large volumes of instances, where answering queries is of primary importance (hence the name QL, Query Language); the last profile is useful in applications that do not require sacrificing too much expressiveness for efficiency. We will not delve into the structural characteristics of these profiles. Here, we only say that they are grounded on very different DLs, each of them fit for the specific purpose it is designed for.

1.6 Syntactical correspondence of OWL(2) with DLs 

As to its practical representation, an ontology is an RDF/XML-structured text file, but, for our goals, it will be more useful to think of it in terms of an RDF graph and, thus, of triples. Table 1.3 describes the correspondence between the most common constructs of DLs and such triples (or a single resource, where applicable), which use RDF(S) syntax together with that of OWL(2). It also contains the names given by OWL to the fundamental concepts and roles of DLs. It is interesting to notice how a DL construct often corresponds to several triples, which sometimes makes interpretation quite hard. The symbols _:x and _list in Table 1.3 indicate, respectively, a blank node and a list (a structure we will not deal with), whereas the prefixes before the elements of triples represent standard W3C namespaces, which serve as abbreviations for the URIs they refer to. 12
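The way a single DL axiom unfolds into several triples can be sketched as follows for an axiom of the form $\exists R.C \sqsubseteq D$. This is only an illustrative sketch: the function name, the `ex:` resources and the blank-node naming scheme are hypothetical, and only the triples listed in Table 1.3 are emitted.

```python
from itertools import count


def translate_some_subclass(role, filler, superclass, ids=None):
    """Return the triples for the axiom (exists role.filler) subClassOf superclass.

    The restriction is represented by a fresh blank node, as in Table 1.3.
    """
    if ids is None:
        ids = count(1)
    node = f"_:x{next(ids)}"          # blank node standing for the restriction
    return [
        (node, "owl:someValuesFrom", filler),
        (node, "owl:onProperty", role),
        (node, "rdfs:subClassOf", superclass),
    ]


# Hypothetical axiom: everything that owns some dog is a dog owner.
triples = translate_some_subclass("ex:owns", "ex:Dog", "ex:DogOwner")
```

The blank node is what ties the three triples together; a query that wants to recognize this axiom in a graph has to match all of them on the same node.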

1.7 SPARQL 

One of the most widespread languages for querying ontologies is SPARQL (a recursive acronym for SPARQL Protocol And RDF Query Language). Although it has many analogies with SQL, most of which are syntactic, it is equipped with a fundamentally different semantics. Whereas query languages for databases essentially work on tables and handle logic conditions to select the rows of these tables that satisfy them, the WHERE clause in SPARQL finds matches between the triples of the query and those of the ontology indicated in the FROM clause. The logic assessments are relegated to the FILTER operator, which has great expressive power, thanks to its many functions, but is rarely used due to the loss of efficiency that its presence may cause. 13 The SELECT clause is very similar to that of SQL. It accepts DISTINCT as a keyword and the classical aggregation operators (COUNT, SUM, MIN, AVG and MAX), while the GROUP BY and HAVING clauses are often used (another similarity) to refine the selection. The variables involved are preceded by the symbol '?', whereas constants do not have a particular syntax, even if they generally coincide with the URI/IRIs of resources present in the ontology one is handling.

Analyzing the WHERE clause in finer detail, one can say that the triples to match are enclosed in a block between braces and separated by a dot, which implies their intersection. Inside these braces, more sub-blocks may be found, which are useful in making the union (UNION operator) and the difference (MINUS operator) between sets. The matching is valid if the variables having the same names in the clause have the same values in the ontology. There exist some useful abbreviations, such as the use of a semicolon or a comma in place of a dot, which act respectively in the following way:

?s ?p1 ?o1 ; ?p2 ?o2 .    shortens    ?s ?p1 ?o1 . ?s ?p2 ?o2 .

?s ?p ?o1 , ?o2 .    shortens    ?s ?p ?o1 . ?s ?p ?o2 .

In addition to selection, one can also have the DESCRIBE, ASK and CONSTRUCT query types, which we will not review here. A peculiar feature is the use of PREFIX clauses, which precede the actual query and declare the abbreviations for the namespaces used inside it.
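The matching rule described above, where equally named variables must take the same value in every triple pattern, can be sketched with a naive matcher over a list of triples. This is not how a SPARQL engine such as Jena is implemented; the function names and the `ex:` data are hypothetical and serve only to illustrate the semantics of a block of dot-separated patterns.

```python
def match_pattern(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`.

    Returns the extended binding, or None if the match fails.
    """
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):             # variable: bind it or check consistency
            if new.setdefault(p, t) != t:
                return None
        elif p != t:                      # constant: must coincide exactly
            return None
    return new


def match_block(patterns, graph):
    """Return all bindings that satisfy every pattern of the block."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            ext
            for b in solutions
            for triple in graph
            if (ext := match_pattern(pattern, triple, b)) is not None
        ]
    return solutions


# Hypothetical data, for illustration only.
graph = [
    ("ex:anna", "rdf:type", "ex:Person"),
    ("ex:anna", "ex:owns", "ex:rex"),
    ("ex:rex", "rdf:type", "ex:Dog"),
    ("ex:bob", "rdf:type", "ex:Person"),
]

# Expanded form of:  ?s rdf:type ex:Person ; ex:owns ?o .
patterns = [("?s", "rdf:type", "ex:Person"), ("?s", "ex:owns", "?o")]
```

With these data, only `ex:anna` satisfies both patterns, because `?s` must denote the same resource in each of them.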



11 In this way, OWL 1 Lite, OWL 1 DL and OWL 2 DL may be considered profiles of OWL 2.
12 The correspondences between namespaces and prefixes are:

rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#

rdfs: http://www.w3.org/2000/01/rdf-schema#

owl: http://www.w3.org/2002/07/owl#

xsd: http://www.w3.org/2001/XMLSchema#

13 Depending on the implementation, the FROM clause may be implied, because the software loads the ontology model into memory separately.
| DL construct | RDF(S)/OWL(2) resource/triple(s) |
|---|---|
| $\top$ | owl:Thing |
| $\bot$ | owl:Nothing |
| $C \sqsubseteq D$ | (C rdfs:subClassOf D) |
| $C \equiv D$ | (C owl:equivalentClass D) |
| $C \sqsubseteq \neg D$ or $C \sqcap D \sqsubseteq \bot$ | (C owl:disjointWith D) |
| $C_1 \sqcap C_2 \sqcap \ldots \sqcap C_n$ | (_:x owl:intersectionOf _list(C1, C2, ..., Cn)) |
| $C_1 \sqcup C_2 \sqcup \ldots \sqcup C_n$ | (_:x owl:unionOf _list(C1, C2, ..., Cn)) |
| $\neg C$ | (_:x owl:complementOf C) |
| $\{a_1, a_2, \ldots, a_n\}$ | (_:x owl:oneOf _list(a1, a2, ..., an)) |
| $\exists R.C$ | (_:x owl:someValuesFrom C) (_:x owl:onProperty R) |
| $\forall R.C$ | (_:x owl:allValuesFrom C) (_:x owl:onProperty R) |
| $\exists R.\{a\}$ | (_:x owl:hasValue a) (_:x owl:onProperty R) |
| $\ge nR$ | (_:x owl:minCardinality n) (_:x owl:onProperty R) |
| $\le nR$ | (_:x owl:maxCardinality n) (_:x owl:onProperty R) |
| $\le nR \sqcap \ge nR$ | (_:x owl:cardinality n) (_:x owl:onProperty R) |
| $\ge nR.C$ | (_:x owl:minQualifiedCardinality n) (_:x owl:onClass C) (_:x owl:onProperty R) |
| $\le nR.C$ | (_:x owl:maxQualifiedCardinality n) (_:x owl:onClass C) (_:x owl:onProperty R) |
| $\le nR.C \sqcap \ge nR.C$ | (_:x owl:qualifiedCardinality n) (_:x owl:onClass C) (_:x owl:onProperty R) |
| $\exists R.\mathit{Self}$ | (_:x owl:hasSelf true) (_:x owl:onProperty R) |
| $C(a)$ | (a rdf:type C) |
| $\{a_i\} \equiv \{a_j\}$ | (ai owl:sameAs aj) |
| $\{a_i\} \sqsubseteq \neg\{a_j\}$ | (ai owl:differentFrom aj) |
| $U$ or $\nabla$ | owl:topObjectProperty |
| $N$ or $\triangle$ | owl:bottomObjectProperty |
| $R \sqsubseteq S$ | (R rdfs:subPropertyOf S) |
| $R \equiv S$ | (R owl:equivalentProperty S) |
| $R \sqcap S \equiv N$ | (R owl:propertyDisjointWith S) |
| $R^-$ | (_:x owl:inverseOf R) |
| $R_1 \circ \ldots \circ R_n \sqsubseteq R$ | (R owl:propertyChainAxiom _list(R1, R2, ..., Rn)) |
| $Refl(R)$ | (R rdf:type owl:ReflexiveProperty) |
| $Irrefl(R)$ | (R rdf:type owl:IrreflexiveProperty) |
| $Sym(R)$ | (R rdf:type owl:SymmetricProperty) |
| $Asym(R)$ | (R rdf:type owl:AsymmetricProperty) |
| $Trans(R)$ | (R rdf:type owl:TransitiveProperty) |
| $Fn(R)$ | (R rdf:type owl:FunctionalProperty) |
| $InvFn(R)$ | (R rdf:type owl:InverseFunctionalProperty) |
| $R(a,b)$ | (a property_name b) |
| $\neg R(a,b)$ | (_:x rdf:type owl:NegativePropertyAssertion) (_:x owl:sourceIndividual a) (_:x owl:assertionProperty R) (_:x owl:targetIndividual b) |

Table 1.3. Correspondences between DLs and SW



2 ANALYSIS AND RESULTS 



2.1 The family $\mathcal{DL}\langle\forall^{\pi}_{0}\rangle$

Description Logics derived from decidable fragments of set theory, generally denoted by the notation $\mathcal{DL}\langle$language_name$\rangle$, are gaining considerable importance.

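The family of logics underlying this class of ontologies, which is the object of the research this project is based on, is focused on the fragment $\forall^{\pi}_{0}$, which has good expressiveness with respect to knowledge representation in real-world applications. The interest it has aroused is related to the NP-completeness of its decision procedure in some cases of practical relevance. $\mathcal{DL}\langle\forall^{\pi}_{0}\rangle$-concepts and $\mathcal{DL}\langle\forall^{\pi}_{0}\rangle$-roles are formed according to the syntax 14

$C, D \to A \mid \top \mid \bot \mid \neg C \mid C \sqcap D \mid C \sqcup D \mid \{a\} \mid \exists R.\mathit{Self} \mid \exists R.\{a\}$

$R, S \to P \mid U \mid R^- \mid \neg R \mid R \sqcap S \mid R \sqcup S \mid R_{C|} \mid R_{|D} \mid R_{C|D} \mid id(C) \mid sym(R)$,

while a $\mathcal{DL}\langle\forall^{\pi}_{0}\rangle$-KB is made of axioms and assertions of the following kinds:

$C \equiv D$, $C \sqsubseteq D$, $C \sqsubseteq \forall R.D$, $\exists R.C \sqsubseteq D$, $C(a)$, $R(a,b)$,
$R \equiv S$, $R \sqsubseteq S$, $R \circ R' \sqsubseteq S$, $Trans(R)$, $Refl(R)$, $Asym(R)$, $Fn(R)$, $InvFn(R)$.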

We may notice that universal restriction is allowed only in the right part of a subsumption, whereas existential 
restriction can appear only in the left part. In addition, neither numerical, nor qualified restrictions are allowed. 
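The positional constraints just stated can be sketched as a small syntactic check on subsumption axioms. This is an illustrative sketch only, not the SPARQL query of Appendix A: the nested-tuple encoding of concepts and the function names are hypothetical, and role expressions are treated as opaque strings whose syntax is not checked.

```python
BOOLEAN_OPS = {"not", "and", "or"}


def is_plain_concept(c):
    """True if c uses only the concept constructors of the grammar above."""
    if isinstance(c, str):                     # atomic concept, "TOP" or "BOTTOM"
        return True
    op = c[0]
    if op in BOOLEAN_OPS:
        return all(is_plain_concept(x) for x in c[1:])
    if op in ("nominal", "self"):              # {a} and (exists R.Self)
        return True
    if op == "some":                           # only (exists R.{a}) is a plain concept
        return isinstance(c[2], tuple) and c[2][0] == "nominal"
    return False                               # "all", cardinalities, etc.


def is_admissible_subsumption(left, right):
    """Check a subsumption left ⊑ right against the allowed axiom shapes."""
    if isinstance(left, tuple) and left[0] == "some":
        # exists R.C ⊑ D: existential restriction only on the left
        if is_plain_concept(left[2]) and is_plain_concept(right):
            return True
    if isinstance(right, tuple) and right[0] == "all":
        # C ⊑ forall R.D: universal restriction only on the right
        return is_plain_concept(left) and is_plain_concept(right[2])
    return is_plain_concept(left) and is_plain_concept(right)
```

For instance, `is_admissible_subsumption(("some", "owns", "Dog"), "DogOwner")` is accepted, whereas the same existential restriction on the right-hand side is rejected.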

2.2 Description of the experiment 

Our analysis is aimed at selecting ontologies corresponding to the $\mathcal{DL}\langle\forall^{\pi}_{0}\rangle$ family, in order to use them as a basis for studying this family of logics in real-world applications. To do that, it is necessary to query as many publicly available ontologies as possible in a quick and efficient way. The use of queries in SPARQL conveniently lends itself to the purpose. The management is provided by dOWLphin, a Java library specifically created in order to easily load the ontologies and prepare the queries. The underlying library is Jena, one of the most common collections of APIs for the Semantic Web, which is very handy because it can deal directly with triples. As a front-end GUI, the program QuAny was used, which was also developed within the experiment and allows one to query local and remote ontologies, and to save the corresponding results on disk for future verification.

2.3 Results and future objectives 

The query was employed to test a significant number of ontologies of BioPortal, the web portal of the National Center for Biomedical Ontology. This choice was not random, but motivated mainly by two reasons. The first concerns the large number of ontologies present on the portal, coming from repositories of biomedical resources spread all over the world. The second relates to the connection that these ontologies have with the real world in general, and with human life and medicine in particular, which are fields offering much food for thought on how widely knowledge of such important themes may be schematized and represented and, most of all, on the role reasoning may have in automatically inferring new information.

Around 30% of the ontologies turned out to be members of the language $\mathcal{DL}\langle\forall^{\pi}_{0}\rangle$, which gives good hope for reasoning on them, given that, as we recall, with respect to computational complexity we are in the realm of NP-completeness. Concerning the remaining 70%, efficient algorithms for conversion will have to be considered and semantic tests will have to be carried out, as previously happened for the $\mathcal{DL}\langle MLSS^{\times}_{2,m}\rangle$ language (cf. [?]).

In Appendix B, a table shows which ontologies were recognized as members of the language $\mathcal{DL}\langle\forall^{\pi}_{0}\rangle$.



14 The symbol $sym(R)$ denotes the symmetric closure of $R$.
A ANALYSIS OF THE SPARQL QUERY 



As the implementation shows, writing declaratively what an algorithm could do through a list 
of well-constructed statements is not a straightforward task. The final query is the result of several 
lines of reasoning about the conversion of $\mathcal{DL}\langle\forall^{\pi}_{0}\rangle$-constructs into those of RDF(S)/OWL(2), 
suitably applied to some test ontologies. 

From Table 1.3, one infers that, for the ontology under examination to be considered part of the family of logics 
seen before, the allowed elements of RDF(S)/OWL(2) are the following: 

• class or property definition (rdf:type) 
• top concept (owl:Thing) 
• bottom concept (owl:Nothing) 
• concept negation (owl:complementOf) 
• intersection and union of concepts (resp. owl:intersectionOf and owl:unionOf) 
• singleton (owl:oneOf) 
• self concept (owl:hasSelf) 
• value restriction (owl:hasValue) 
• universal role (owl:topObjectProperty) 
• role inverse (owl:inverseOf) 
• intersection and union of roles (not present in OWL 2)¹⁵ 
• role restrictions (not present in OWL 2) 
• identity concept (not present in OWL 2) 
• symmetric closure (it can be emulated by means of union and inverse of roles) 
• equivalence axiom between concepts (owl:equivalentClass) 
• subsumption axiom between concepts (rdfs:subClassOf) 
• existential restriction (owl:someValuesFrom) only in the left part of a subsumption 
• universal restriction (owl:allValuesFrom) only in the right part of a subsumption 
• minimal numerical restriction (owl:minCardinality) with cardinality not greater than 1, only in the left part 
of a subsumption 
• declaration of concept membership (rdf:type) 
• declaration of role membership 
• equivalence between roles (owl:equivalentProperty) 
• inclusion between roles (rdfs:subPropertyOf) 
• role chaining (owl:propertyChainAxiom) 
• transitive property (owl:TransitiveProperty) 
• reflexive property (owl:ReflexiveProperty) 
• asymmetric property (owl:AsymmetricProperty) 

We must add to these all the constructs that can be derived from a combination of them, such as 

• disjointness between concepts (owl:disjointWith, owl:members) 
• union of disjoint concepts (owl:disjointUnionOf) 
• equivalence and non-equivalence between individuals (resp. owl:sameAs and owl:differentFrom) 
• empty role (owl:bottomObjectProperty) 
• role domain (rdfs:domain, some cases excluded)¹⁶ 
• role range (rdfs:range) 
• disjointness of properties (owl:propertyDisjointWith, owl:members) 
• irreflexive property (owl:IrreflexiveProperty) 
• symmetric property (owl:SymmetricProperty) 
• direct and inverse functional property (resp. owl:FunctionalProperty, owl:InverseFunctionalProperty) 
• declaration of role non-membership (owl:NegativePropertyAssertion) 

¹⁵ Obviously, we may ignore the constructs not present in OWL 2, since they cannot be a cause of rejection of the ontology. 

Consequently, the constructs which cause the rejection of an ontology are: 

• existential restriction not in the left part of a subsumption 
• universal restriction not in the right part of a subsumption 
• minimal numerical restriction with cardinality greater than 1 or not in the left part of a subsumption 
• qualified numerical restrictions (owl:[max|min]QualifiedCardinality, owl:qualifiedCardinality) 
• maximum and exact unqualified numerical restrictions (resp. owl:maxCardinality, owl:cardinality) 
• some cases of domain declaration¹⁷ 
• datatype (rdfs:Datatype, owl:DatatypeProperty) 

In order to create the query, we must consider that the language $\mathcal{DL}\langle\forall^{\pi}_{0}\rangle$ is very expressive and provides 
constructs and axioms which cover several elements of RDF(S)/OWL(2). Thus, to shorten the task of enumerating 
them, we adopted the criterion of selecting the triples not related to the language. Consequently, the 
result of the query will contain at least one triple if the examined ontology is to be rejected. There are as many clauses 
as elements to consider, connected by the UNION operator; thus, if at least one match is found, a non-empty result 
will be obtained, i.e., the analyzed ontology will not be a member of the language. 
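
The selection criterion can also be illustrated outside SPARQL. The following Python sketch is only an illustration and not the dOWLphin implementation: it assumes that an ontology is given as a set of (subject, predicate, object) tuples with prefixed names written as plain strings, and it covers only the rejecting constructs that are recognized by the mere presence of a triple (datatypes, qualified, exact and maximum cardinality). The names `REJECTING_CLAUSES`, `rejecting_triples` and `is_member` are hypothetical.

```python
# Illustrative sketch: an ontology is a set of (subject, predicate, object)
# tuples; all names below are hypothetical and not taken from dOWLphin.

# One predicate per rejecting clause; the clauses are combined as in a UNION.
REJECTING_CLAUSES = [
    lambda s, p, o: o == "owl:DatatypeProperty",  # datatype property
    lambda s, p, o: o == "rdfs:Datatype",         # datatype
    lambda s, p, o: p == "owl:onClass",           # qualified cardinality
    lambda s, p, o: p == "owl:cardinality",       # exact cardinality
    lambda s, p, o: p == "owl:maxCardinality",    # maximum cardinality
]


def rejecting_triples(triples):
    """Return the triples matched by at least one rejecting clause."""
    return {t for t in triples if any(clause(*t) for clause in REJECTING_CLAUSES)}


def is_member(triples):
    """An ontology is accepted only when no rejecting triple is found."""
    return not rejecting_triples(triples)


accepted = {
    ("ex:Heart", "rdf:type", "owl:Class"),
    ("ex:Heart", "rdfs:subClassOf", "ex:Organ"),
}
rejected = accepted | {("_:r1", "owl:maxCardinality", "2")}

print(is_member(accepted))  # True
print(is_member(rejected))  # False
```

As in the query, an empty result means acceptance, and a single matching triple is enough to reject the ontology.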

Now, let us analyze the correctness of the query in detail (the relevant line numbers of the query are given in 
parentheses where necessary). 

• The first clause (lines 7-19) includes the triples corresponding to existential restrictions (line 8). Yet, not all of 
them contribute to the rejection of the ontology, but only those which are not in the left part of a subsumption. 
Thus, line 10 forces the exclusion of the latter, provided that any is found. If no such subsumption is found, we 
have a match, according to line 8; otherwise, we must check that neither of the following cases occurs (lines 11-17):¹⁸ 

- the anonymous node corresponding to a restriction is in the right part of any triple (line 12): this 
covers all the axioms with the restriction in the right part, as well as the various operators (e.g., union, 
intersection, complement, etc.) which work on lists and single classes in the right part; 

- the above-mentioned node is in the left part of a triple corresponding to an axiom (lines 13-16): this 
covers single and multiple equivalences and disjointness axioms. 

• The second clause (lines 19-33) includes the triples corresponding to universal restrictions (line 20). When a 
triple of this kind is found, in order for a match to be valid, it is also necessary to check that at least one 
of the following cases occurs (lines 22-31): 

- the anonymous node corresponding to a restriction is in the right part of a triple (line 23), but this triple 
is not a subsumption (line 24), nor is that node the domain of a property (line 25); 

- the above-mentioned node is in the left part of a triple (line 27), but it is not part of the definition of the 
restriction itself (lines 28-30) (otherwise, an always-false condition would be represented): this checks 
that the restriction is neither in the left part of an axiom, nor in the left part of some class construct. 

• The third clause (lines 33-46) includes the triples corresponding to an operator of minimal cardinality (line 
34). Here we proceed (line 36 and lines 39-43) as in the first clause, with the difference (line 37) that the 
cardinality value must also be not greater than 1.¹⁹ 

• The remaining clauses (lines 47-51) check for datatype (lines 47 and 48), qualified cardinality (line 49: the 
owl:onClass property is peculiar to this kind of restriction), exact and maximum cardinality (lines 50 and 51) 
triples. 

• To speed up the execution, the LIMIT operator (line 52) reduces the size of the 
result to only one element, which is enough to reject the ontology. 

¹⁶ Role domain corresponds to the DL axiom $\exists R.\top \sqsubseteq C$, so it is necessary to verify that C is expressible in the language. 
¹⁷ Cf. footnote 16. 
¹⁸ The second-level MINUS operator could be optimized by including the various triples contained in its clauses after line 8; nonetheless, we 
must not forget that the matching of triples is notoriously inefficient: the double nesting helps to prevent 
the SPARQL engine from performing useless searches when no subsumption is found in line 10. 
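
The reasoning behind the first clause can be paraphrased procedurally. The Python sketch below follows the prose description above rather than reproducing the exact semantics of SPARQL's MINUS operator; it assumes that the ontology is a set of (subject, predicate, object) tuples with prefixed names as plain strings, and the function name `rejected_existentials` and the constant names are hypothetical.

```python
# Illustrative sketch of the first clause, following the prose description;
# it does not reproduce SPARQL's MINUS semantics exactly, and all names are
# hypothetical.

SOME_VALUES_FROM = "owl:someValuesFrom"
SUBCLASS_OF = "rdfs:subClassOf"
LEFT_AXIOMS = {
    "owl:equivalentClass",
    "owl:disjointWith",
    "owl:members",
    "owl:disjointUnionOf",
}


def rejected_existentials(triples):
    """Return the existential-restriction nodes that cause rejection."""
    restrictions = {s for s, p, o in triples if p == SOME_VALUES_FROM}
    left_of_subsumption = {s for s, p, o in triples if p == SUBCLASS_OF}
    in_right_part = {o for s, p, o in triples}
    left_of_axiom = {s for s, p, o in triples if p in LEFT_AXIOMS}
    # Accepted: in the left part of a subsumption, never in the right part of
    # a triple, and never in the left part of one of the listed axioms.
    accepted = (restrictions & left_of_subsumption) - in_right_part - left_of_axiom
    return restrictions - accepted


ok = {
    ("_:r", "owl:onProperty", "ex:partOf"),
    ("_:r", "owl:someValuesFrom", "ex:Body"),
    ("_:r", "rdfs:subClassOf", "ex:BodyPart"),
}
bad = {
    ("_:r", "owl:onProperty", "ex:partOf"),
    ("_:r", "owl:someValuesFrom", "ex:Body"),
    ("ex:Organ", "rdfs:subClassOf", "_:r"),
}

print(rejected_existentials(ok))   # set()
print(rejected_existentials(bad))  # {'_:r'}
```

The third clause could be sketched in the same way by starting from the owl:minCardinality triples and additionally requiring the cardinality value to be at most 1.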



1   PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
2   PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> 
3   PREFIX owl: <http://www.w3.org/2002/07/owl#> 
4 
5   SELECT * 
6   WHERE { 
7     { 
8       ?left owl:someValuesFrom ?class . 
9       MINUS { 
10        ?left rdfs:subClassOf ?right . 
11        MINUS { 
12          { ?s ?p ?left . } UNION 
13          { ?left owl:equivalentClass ?right . } UNION 
14          { ?left owl:disjointWith ?right . } UNION 
15          { ?left owl:members ?right . } UNION 
16          { ?left owl:disjointUnionOf ?right . } 
17        } 
18      } 
19    } UNION { 
20      ?restr owl:allValuesFrom ?class . 
21      { 
22        { 
23          ?x ?prop ?restr . 
24          MINUS { ?x rdfs:subClassOf ?restr . } 
25          MINUS { ?x rdfs:domain ?restr . } 
26        } UNION { 
27          ?restr ?prop ?x . 
28          MINUS { ?restr rdf:type ?x . } 
29          MINUS { ?restr owl:onProperty ?x . } 
30          MINUS { ?restr owl:allValuesFrom ?x . } 
31        } 
32      } 
33    } UNION { 
34      ?left owl:minCardinality ?num . 
35      MINUS { 
36        ?left rdfs:subClassOf ?right . 
37        FILTER (?num<=1) . 
38        MINUS { 
39          { ?s ?p ?left . } UNION 
40          { ?left owl:equivalentClass ?right . } UNION 
41          { ?left owl:disjointWith ?right . } UNION 
42          { ?left owl:members ?right . } UNION 
43          { ?left owl:disjointUnionOf ?right . } 
44        } 
45      } 
46    } UNION 
47    { ?s ?p owl:DatatypeProperty . } UNION 
48    { ?s ?p rdfs:Datatype . } UNION 
49    { ?s owl:onClass ?o . } UNION 
50    { ?s owl:cardinality ?o . } UNION 
51    { ?s owl:maxCardinality ?o . } 
52  } LIMIT 1 

¹⁹ Footnote 18 holds also for the nested MINUSes. 
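
Lines 19-33 of the listing, which deal with universal restrictions, can be read procedurally as in the following Python sketch. It is an approximation written from the textual explanation of the second clause, not a translation that preserves SPARQL's MINUS semantics in every corner case; the ontology is assumed to be a set of (subject, predicate, object) tuples, and `rejected_universals` and the constant names are hypothetical.

```python
# Illustrative sketch of the second clause (universal restrictions), written
# from the prose explanation; all names are hypothetical.

ALL_VALUES_FROM = "owl:allValuesFrom"
# Predicates under which a universal restriction may sit in the right part.
ALLOWED_AS_OBJECT = {"rdfs:subClassOf", "rdfs:domain"}
# Predicates that merely define the restriction itself.
DEFINITION = {"rdf:type", "owl:onProperty", "owl:allValuesFrom"}


def rejected_universals(triples):
    """Return the universal-restriction nodes that cause rejection."""
    restrictions = {s for s, p, o in triples if p == ALL_VALUES_FROM}
    rejected = set()
    for s, p, o in triples:
        # Right part of a triple that is neither a subsumption nor a domain.
        if o in restrictions and p not in ALLOWED_AS_OBJECT:
            rejected.add(o)
        # Left part of a triple that is not part of its own definition.
        if s in restrictions and p not in DEFINITION:
            rejected.add(s)
    return rejected


definition = {
    ("_:u", "rdf:type", "owl:Restriction"),
    ("_:u", "owl:onProperty", "ex:partOf"),
    ("_:u", "owl:allValuesFrom", "ex:Body"),
}
ok = definition | {("ex:Organ", "rdfs:subClassOf", "_:u")}
bad = definition | {("_:u", "rdfs:subClassOf", "ex:Organ")}

print(rejected_universals(ok))   # set()
print(rejected_universals(bad))  # {'_:u'}
```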



B TEST AGAINST BIOPORTAL ONTOLOGIES 



The SPARQL query was tested against the ontologies of BioPortal, available at the link http://rest.bioontology.org/biopoi 
The API KEY is an identifier for registered users. The following table associates the ID of the previous URL with 
the symbolic name of the corresponding ontology and highlights the ones which belong to the language. 



ID | Name | Res | ID | Name | Res | ID | Name | Res 
1033 | NMR Metabolomics Investig. | Y | 1362 | Hymenoptera Anatomy | N | 1552 | Reprod. Trait and Phenotype | Y 
1039 | Proteomics Data | N | 1369 | PhysicalFields | Y | 1560 | Cognitive Paradigm | N 
1052 | Proteins | N | 1381 | NIF Dysfunction | Y | 1565 | Medical Social Entities | Y 
1054 | Amino-acid | N | 1393 | Information Artifacts | N | 1567 | Pharmacovigilance | Y 
1056 | Basic Vertebrate Anatomy | N | 1394 | Syndromic Surveillance | N | 1569 | Host Pathogen Interactions | N 
1058 | SNP | N | 1398 | Language Disorder in Autism | Y | 1570 | Traditional Med. Constitution | Y 
1059 | Computer-based Patient Record | N | 1399 | Pilot Ontology | N | 1571 | Traditional Med. Other Factors | Y 
1060 | Epoch Clinical Trial | N | 1401 | Nursing Practice | N | 1572 | Trad. Med. Signs and Symptoms | Y 
1061 | Pharmacogenomics | N | 1402 | NIF Cell | N | 1573 | Traditional Med. Meridian | Y 
1068 | Subcellular Anatomy | N | 1406 | LinkingKing2PEP | N | 1576 | FDA Med. Devices | N 
1082 | Gene Regulation (GRO) | N | 1407 | Description of Dynamics | N | 1578 | HOM-Helixhauser Scores | N 
1083 | NanoParticles | N | 1409 | PKORe | Y | 1580 | Adverse Event Reporting | N 
1084 | NIFSTD | Y | 1410 | Kinetic Simulation Algorithm | N | 1581 | Health Indicators | N 
1086 | Disease Genetic Investigation | N | 1411 | Functioning, Disability and Health | N | 1582 | CAO | N 
1087 | Geographical Regions | Y | 1413 | Software | N | 1588 | General Purpose Datatypes | N 
1088 | MaHCO | N | 1414 | General Medical Science | Y | 1596 | HOM_MDCs-DRGS | N 
1089 | BIRNLex | N | 1415 | CTCAE | N | 1613 | Bone Dysplasia | N 
1092 | Infectious Diseases | N | 1417 | Influenza | N | 1615 | Chemogenomics | N 
1100 | Genetic Intervals | N | 1418 | TOK | N | 1616 | Phylogenetics | Y 
1104 | Biomedical Resource | N | 1438 | Breast tissue cell lines | N | 1625 | HOM-ICD9PCS | N 
1106 | Gene Regulation (BOOTStrep) | N | 1439 | General Formal (GFO) | N | 1627 | HOMERUN Metadata | N 
1116 | Bleeding History Phenotype | N | 1440 | General Formal (GFO-Bio) | N | 1629 | HOM-UCARE Demographics | N 
1122 | Skin Physiology | N | 1444 | Chemical Information | N | 1631 | HOM-Harvard | N 
1123 | Biomedical Investigations | N | 1461 | Translational Medicine | N | 1633 | Cognitive Alias | N 
1126 | Family Health History | N | 1484 | External Causes of Injuries | N | 1638 | Data Mining | Y 
1128 | Comparative Data Analysis | N | 1487 | Body System | Y | 1639 | Epilepsy | N 
1130 | Cancer Research and Mgmt | N | 1488 | SysMO-IERM | N | 1640 | Pediatric Terminology | N 
1131 | MGED | N | 1489 | Adverse Events | N | 1641 | HOM-ICD9CM-ECODES | N 
1134 | BioTop | N | 1494 | Tissue Microarray | N | 1642 | HOM-DXPROCS MDCDRG | N 
1136 | Experimental Factors | Y | 1497 | PMA2010 | N | 1643 | HOM-ICD9_PROCS OSHPD | N 
1141 | Physics for Biology | N | 1500 | RNA | N | 1648 | HOM-DATASOURCE OSHPD | N 
1142 | Cardiac Electrophysiology | Y | 1501 | Neomark Oral Cancer-based | N | 1650 | Units | Y 
1146 | Electrocardiography | N | 1505 | MicroRNA Target Prediction | N | 1652 | HOM-OSHPD Use cases | N 
1149 | Dermatology Lexicon | N | 1515 | Interaction Network | Y | 1653 | HOM-PROCS2 OSHPD | N 
1172 | Vaccines | N | 1521 | Neural Motor Recovery | Y | 1658 | Hewan Invertebrata | N 
1183 | Lipids | N | 1522 | BioPAX | N | 1665 | Student Health Record | N 
1190 | Parasite LifeCycle | N | 1523 | OBOE-SBC | N | 1666 | Emotion | N 
1192 | Proteomics Pipeline Infrastructures | N | 1524 | OBOE | N | 1667 | HOM-DATASOURCE OSHPDSC | N 
1237 | Situation-based Access Control | N | 1530 | Animal natural history | N | 1668 | HOM-OSHPD-SC | N 
1247 | GeoSpecies | N | 1532 | SemanticScience | Y | 1671 | Quantitative Imaging Biomarkers | N 
1249 | Smoking Behavior Risk | N | 1533 | BioAssay | N | 1672 | DIKB-Evidence | N 
1290 | ABA Adult Mouse Brain | Y | 1534 | Apollo-akesios | Y | 1676 | Randomized Controlled Trials | N 
1304 | Breast Cancer Grading | N | 1537 | Brucellosis | N | 1686 | Neomark Oral Cancer | Y 
1314 | Cell Line (2) | N | 1538 | Roles | Y | 1696 | Synapses | Y 
1321 | Neural ElectroMagnetic Ontology | N | 1540 | Drug Discovery Investigations | N | 3002 | Mental Functioning | Y 
1332 | Basic Formal Ontology | Y | 1541 | Cell Line (MCCL2) | N | | | 
1335 | Parasite Experiments | N | 1550 | PHARE | N | | | 